ad performance
Attention Is Not Always the Answer: Optimizing Voice Activity Detection with Simple Feature Fusion
Tripathi, Kumud, Kumar, Chowdam Venkata, Wasnik, Pankaj
Voice Activity Detection (VAD) plays a key role in speech processing, often utilizing hand-crafted or neural features. This study examines the effectiveness of Mel-Frequency Cepstral Coefficients (MFCCs) and pre-trained model (PTM) features, including wav2vec 2.0, HuBERT, WavLM, UniSpeech, MMS, and Whisper. We propose FusionVAD, a unified framework that combines both feature types using three fusion strategies: concatenation, addition, and cross-attention (CA). Experimental results reveal that simple fusion techniques, particularly addition, outperform CA in both accuracy and efficiency. Fusion-based models consistently surpass single-feature models, highlighting the complementary nature of MFCCs and PTM features. Notably, our best-performing fusion model exceeds the state-of-the-art Pyannote across multiple datasets, achieving an absolute average improvement of 2.04%. These results confirm that simple feature fusion enhances VAD robustness while maintaining computational efficiency.
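As a rough illustration of the two simpler fusion strategies named in the abstract above (concatenation and addition), here is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the toy dimensions, and the `project` helper are all assumptions. In particular, the abstract does not say how the MFCC and PTM feature dimensions are matched before addition, so the linear projection below is only one plausible choice.

```python
from typing import List

Frame = List[float]  # one feature vector per audio frame


def fuse_concat(mfcc: List[Frame], ptm: List[Frame]) -> List[Frame]:
    """Concatenation: output dimension is dim(mfcc) + dim(ptm)."""
    if len(mfcc) != len(ptm):
        raise ValueError("feature streams must have the same number of frames")
    return [m + p for m, p in zip(mfcc, ptm)]


def project(frames: List[Frame], weights: List[List[float]]) -> List[Frame]:
    """Linear map with weights of shape (in_dim, out_dim).

    Hypothetical helper: in a trained model these weights would be learned.
    """
    return [
        [sum(x * w for x, w in zip(frame, column)) for column in zip(*weights)]
        for frame in frames
    ]


def fuse_add(a: List[Frame], b: List[Frame]) -> List[Frame]:
    """Addition: element-wise sum, so both streams need identical shapes."""
    if len(a) != len(b) or any(len(x) != len(y) for x, y in zip(a, b)):
        raise ValueError("feature streams must have identical shapes")
    return [[x + y for x, y in zip(fa, fb)] for fa, fb in zip(a, b)]


# Toy data: 2 frames, MFCC dim 2, PTM dim 3 (real dimensions are far larger).
mfcc = [[1.0, 2.0], [3.0, 4.0]]
ptm = [[0.5, 0.5, 0.5], [1.0, 1.0, 1.0]]
to_ptm_dim = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]  # made-up projection weights

concat_fused = fuse_concat(mfcc, ptm)                 # 2 frames, dim 5
add_fused = fuse_add(project(mfcc, to_ptm_dim), ptm)  # 2 frames, dim 3
```

Concatenation grows the feature dimension, while addition keeps it fixed at the PTM dimension, which may be one reason addition is cheap downstream.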
AdParaphrase v2.0: Generating Attractive Ad Texts Using a Preference-Annotated Paraphrase Dataset
Murakami, Soichiro, Zhang, Peinan, Kamigaito, Hidetaka, Takamura, Hiroya, Okumura, Manabu
Identifying factors that make ad text attractive is essential for advertising success. This study proposes AdParaphrase v2.0, a dataset for ad text paraphrasing, containing human preference data, to enable the analysis of the linguistic factors and to support the development of methods for generating attractive ad texts. Compared with v1.0, this dataset is 20 times larger, comprising 16,460 ad text paraphrase pairs, each annotated with preference data from ten evaluators, thereby enabling a more comprehensive and reliable analysis. Through the experiments, we identified multiple linguistic features of engaging ad texts that were not observed in v1.0 and explored various methods for generating attractive ad texts. Furthermore, our analysis demonstrated the relationships between human preference and ad performance, and highlighted the potential of reference-free metrics based on large language models for evaluating ad text attractiveness. The dataset is publicly available at: https://github.com/CyberAgentAILab/AdParaphrase-v2.0.
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
- Asia > Japan > Kyūshū & Okinawa > Kyūshū > Kumamoto Prefecture > Kumamoto (0.04)
- Europe > Denmark > Capital Region > Copenhagen (0.04)
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.93)
AD-LLM: Benchmarking Large Language Models for Anomaly Detection
Yang, Tiankai, Nian, Yi, Li, Shawn, Xu, Ruiyao, Li, Yuangang, Li, Jiaqi, Xiao, Zhuo, Hu, Xiyang, Rossi, Ryan, Ding, Kaize, Hu, Xia, Zhao, Yue
Anomaly detection (AD) is an important machine learning task with many real-world uses, including fraud detection, medical diagnosis, and industrial monitoring. Within natural language processing (NLP), AD helps detect issues like spam, misinformation, and unusual user activity. Although large language models (LLMs) have had a strong impact on tasks such as text generation and summarization, their potential in AD has not been studied enough. This paper introduces AD-LLM, the first benchmark that evaluates how LLMs can help with NLP anomaly detection. We examine three key tasks: (i) zero-shot detection, using LLMs' pre-trained knowledge to perform AD without task-specific training; (ii) data augmentation, generating synthetic data and category descriptions to improve AD models; and (iii) model selection, using LLMs to suggest unsupervised AD models. Through experiments with different datasets, we find that LLMs can work well in zero-shot AD, that carefully designed augmentation methods are useful, and that explaining model selection for specific datasets remains challenging. Based on these results, we outline six future research directions on LLMs for AD.
- Information Technology (0.67)
- Health & Medicine (0.66)
- Law Enforcement & Public Safety > Fraud (0.48)
- Energy > Oil & Gas (0.34)
A Generic Machine Learning Framework for Fully-Unsupervised Anomaly Detection with Contaminated Data
Ulmer, Markus, Zgraggen, Jannik, Huber, Lilach Goren
Anomaly detection (AD) tasks have been solved using machine learning algorithms in various domains and applications. The great majority of these algorithms use normal data to train a residual-based model and assign anomaly scores to unseen samples based on their dissimilarity with the learned normal regime. The underlying assumption of these approaches is that anomaly-free data is available for training. This is, however, often not the case in real-world operational settings, where the training data may be contaminated with an unknown fraction of abnormal samples. Training with contaminated data, in turn, inevitably leads to a deteriorated AD performance of the residual-based algorithms. In this paper we introduce a framework for a fully unsupervised refinement of contaminated training data for AD tasks. The framework is generic and can be applied to any residual-based machine learning model. We demonstrate the application of the framework to two public datasets of multivariate time series machine data from different application fields. We show its clear superiority over the naive approach of training with contaminated data without refinement. Moreover, we compare it to the ideal, unrealistic reference in which anomaly-free data would be available for training. The method is based on evaluating the contribution of individual samples to the generalization ability of a given model, and contrasting the contribution of anomalies with that of normal samples. As a result, the proposed approach is comparable to, and often outperforms, training with normal samples only.
- North America > United States (0.28)
- Europe > United Kingdom (0.14)
- Europe > Switzerland (0.14)
- Europe > Germany (0.14)
- Materials > Chemicals > Industrial Gases > Liquified Gas (0.46)
- Materials > Chemicals > Commodity Chemicals > Petrochemicals > LNG (0.46)
- Energy > Oil & Gas > Midstream (0.46)
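The residual-based scoring that the Ulmer et al. abstract above takes as its starting point, and the way contamination degrades it, can be sketched with a deliberately trivial model. This is an illustrative assumption, not the paper's refinement framework: the "model" here is just the per-feature mean of the training data, the anomaly score is the Euclidean residual to that mean, and all names and numbers are made up.

```python
import math
from typing import List, Sequence


def fit_mean(train: Sequence[Sequence[float]]) -> List[float]:
    """'Train' a trivial normal-regime model: the per-feature mean."""
    n = len(train)
    return [sum(column) / n for column in zip(*train)]


def residual_score(sample: Sequence[float], mean: Sequence[float]) -> float:
    """Anomaly score: Euclidean distance between sample and learned regime."""
    return math.sqrt(sum((x - m) ** 2 for x, m in zip(sample, mean)))


# Toy training sets: one anomaly-free, one contaminated with an abnormal sample.
clean_train = [[1.0, 1.0], [1.2, 0.8], [0.8, 1.2]]
contaminated_train = clean_train + [[4.0, 5.0]]

anomaly = [4.0, 5.0]
score_clean = residual_score(anomaly, fit_mean(clean_train))
score_contaminated = residual_score(anomaly, fit_mean(contaminated_train))
```

With the contaminated training set, the learned mean is pulled toward the abnormal sample, so the same anomaly receives a lower score and becomes harder to separate from normal data.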
Adversarial Anomaly Detection using Gaussian Priors and Nonlinear Anomaly Scores
Lüer, Fiete, Weber, Tobias, Dolgich, Maxim, Böhm, Christian
Anomaly detection in imbalanced datasets is a frequent and crucial problem, especially in the medical domain where retrieving and labeling irregularities is often expensive. By combining the generative stability of a $\beta$-variational autoencoder (VAE) with the discriminative strengths of generative adversarial networks (GANs), we propose a novel model, $\beta$-VAEGAN. We investigate methods for composing anomaly scores based on the discriminative and reconstructive capabilities of our model. Existing work focuses on linear combinations of these components to determine if data is anomalous. We advance existing work by training a kernelized support vector machine (SVM) on the respective error components to also consider nonlinear relationships. This improves anomaly detection performance, while allowing faster optimization. Lastly, we use the deviations from the Gaussian prior of $\beta$-VAEGAN to form a novel anomaly score component. In comparison to state-of-the-art work, we improve the $F_1$ score during anomaly detection from 0.85 to 0.92 on the widely used MITBIH Arrhythmia Database.
- Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.54)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
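For context on the Lüer et al. abstract above, the linear score composition it attributes to existing work can be sketched as a weighted sum of a reconstruction error and a discriminator-based error. The function name, the weight `lam`, and the convex-combination form are assumptions for illustration; the paper's own contribution (a kernelized SVM over the error components) is not reproduced here.

```python
def linear_anomaly_score(recon_error: float, disc_error: float, lam: float = 0.5) -> float:
    """Convex combination of two error components (hypothetical baseline).

    lam = 1.0 uses only the reconstruction error,
    lam = 0.0 uses only the discriminator-based error.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * recon_error + (1.0 - lam) * disc_error


# Made-up error values for a single sample.
score = linear_anomaly_score(recon_error=0.2, disc_error=0.6, lam=0.5)
```

A linear rule like this can only draw a straight decision boundary in the space of error components, which is the limitation a nonlinear classifier over the same components is meant to address.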
Normality-Calibrated Autoencoder for Unsupervised Anomaly Detection on Data Contamination
Yu, Jongmin, Oh, Hyeontaek, Kim, Minkyung, Kim, Junsik
In this paper, we propose Normality-Calibrated Autoencoder (NCAE), which can boost anomaly detection performance on contaminated datasets without any prior information or explicit abnormal samples in the training phase. The NCAE adversarially generates high-confidence normal samples from a latent space having low entropy and leverages them to predict abnormal samples in a training dataset. NCAE is trained to minimise reconstruction errors in uncontaminated samples and maximise reconstruction errors in contaminated samples. The experimental results demonstrate that our method outperforms shallow, hybrid, and deep methods for unsupervised anomaly detection and achieves performance comparable to semi-supervised methods using labelled anomaly samples in the training phase. The source code is publicly available on https://github.com/andreYoo/
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Asia > South Korea > Daejeon > Daejeon (0.04)
PadSquad Deploys the Iguazio Data Science Platform
Iguazio, the data science platform built for production and real-time machine learning applications, announced it has been deployed by mobile software company PadSquad to improve the relevance and performance of the digital campaigns they run for their customers worldwide. PadSquad is revolutionizing traditional media with interactive features and innovative technologies that transform the audiences' experience and engagement with ad creatives. Iguazio was deployed by PadSquad to use AI to improve ad performance and reduce media costs for their customers. They do this by ingesting and acting upon real-time events – from contextual content on the page, engagement with creative elements like video views, swipeable panels, and hot spots, to the season and time of day – at a rate of over 3,000 events per second. Utilizing online and offline behavioral data from multiple sources, available to them through third-party platforms and their own internal tools, PadSquad can now harness machine learning to optimize ad performance and provide a better and more personalized user experience for their customers' audiences.
A Deep Prediction Network for Understanding Advertiser Intent and Satisfaction
Guo, Liyi, Lu, Rui, Zhang, Haoqi, Jin, Junqi, Zheng, Zhenzhe, Wu, Fan, Li, Jin, Xu, Haiyang, Li, Han, Lu, Wenkai, Xu, Jian, Gai, Kun
For e-commerce platforms such as Taobao and Amazon, advertisers play an important role in the entire digital ecosystem: their behaviors explicitly influence users' browsing and shopping experience; more importantly, advertisers' expenditure on advertising constitutes a primary source of platform revenue. Therefore, providing better services for advertisers is essential for the long-term prosperity of e-commerce platforms. To achieve this goal, the ad platform needs to have an in-depth understanding of advertisers in terms of both their marketing intents and satisfaction with the advertising performance, based on which further optimization could be carried out to serve the advertisers in the correct direction. In this paper, we propose a novel Deep Satisfaction Prediction Network (DSPN), which models advertiser intent and satisfaction simultaneously. It employs a two-stage network structure where the advertiser intent vector and satisfaction are jointly learned by considering the features of advertisers' action information and advertising performance indicators. Experiments on an Alibaba advertisement dataset and online evaluations show that our proposed DSPN outperforms state-of-the-art baselines and has stable performance in terms of AUC in the online environment. Further analyses show that DSPN not only predicts advertisers' satisfaction accurately but also learns an explainable advertiser intent, revealing the opportunities to optimize the advertising performance further.
- Europe > Ireland (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
- Marketing (1.00)
- Information Technology > Services > e-Commerce Services (0.69)
Seamless Organic and Paid Search Integration Strategy
Paid and organic search work together like peanut butter and jelly: on their own they are okay, but together they make magic! When it comes to PPC vs. SEO, there is only one winning strategy: integrating both. The key to any optimization strategy lies in the data. In this post, we will quickly take you through linking your Google Ads and Search Console accounts and finding the right reports to draw insights from both, so you can optimize your organic and paid search strategy for peak performance. In short, following these three short steps will give you additional organic data in your AdWords reports, which you can then use to improve your organic reach as well as your search ad performance.
- Marketing (0.79)
- Information Technology > Services (0.60)
Effective Social Media Ads: How to Leverage AI-Assisted Creativity
The power of social media continues to transform how brands engage their target audiences. It's also placed a burden on them to be "always on." And brands, for the most part, are responding to customer demand. Some 42% of customer service responses made through Facebook, for instance, are answered in the first 60 minutes. Marketers are still looking for solutions that allow them to create, test, and choose the most effective social media ads, and AI is making it possible.